suggested workflow

Gelman, A. et al., ‘Bayesian Workflow’, 2020, arXiv:2011.01808

suggested workflow

a book-length treatment is in progress: https://sites.stat.columbia.edu/gelman/workflow-book/

inspection PoD

looking for damage

inspection PoD: prior checks

\[\begin{align} \textrm{indication}_i &\sim \text{Bernoulli}(p_i) \\ \textrm{logit}(p_i) &= \alpha + \beta_{d} \times d_i \end{align}\]
  • We want to infer \(\alpha\) and \(\beta_{d}\)
  • We can add priors to:
    • align plausible outcomes with scientific knowledge
    • provide regularisation (and improve predictive performance)
  • …but how can we select them?
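one common answer is a prior predictive check: sample \(\alpha\) and \(\beta_{d}\) from candidate priors and look at the PoD-vs-depth curves they imply, before touching any data. A minimal stdlib-Python sketch (the priors match the Stan model shown later; the depth grid and number of draws are illustrative):

```python
import math
import random

def inv_logit(x):
    # logistic link: log-odds -> probability
    return 1.0 / (1.0 + math.exp(-x))

random.seed(1)

# candidate priors (the same ones used in the Stan model later):
#   alpha      ~ Normal(-6, sqrt(2))
#   beta_depth ~ Normal(1.5, sqrt(0.5))
depths = [d / 2 for d in range(21)]  # 0 to 10 mm, illustrative grid

curves = []
for _ in range(100):
    alpha = random.gauss(-6.0, math.sqrt(2.0))
    beta_depth = random.gauss(1.5, math.sqrt(0.5))
    # each prior draw implies a whole PoD-vs-depth curve
    curves.append([inv_logit(alpha + beta_depth * d) for d in depths])

# do the implied outcomes look physically plausible?
mean_pod_at_0 = sum(c[0] for c in curves) / len(curves)
mean_pod_at_10 = sum(c[-1] for c in curves) / len(curves)
print(f"mean prior PoD at  0 mm: {mean_pod_at_0:.3f}")
print(f"mean prior PoD at 10 mm: {mean_pod_at_10:.3f}")
```

if the prior curves say an intact surface is often "detected", or that deep damage is usually missed, the priors disagree with what we know about the inspection and should be revised.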

inspection PoD: prior checks

load some data

Code
library(tidyverse)

poi_data <- read_csv("../../data/poi.csv")
head(poi_data, 3)
# A tibble: 3 × 4
  anomaly_id depth_mm length_mm indication
       <dbl>    <dbl>     <dbl> <lgl>     
1          1        1         3 FALSE     
2          2        2         3 FALSE     
3          3        3         3 FALSE     
Code
import polars as pl

poi_data = pl.read_csv("../../data/poi.csv")
poi_data.head(3)
shape: (3, 4)
anomaly_id depth_mm length_mm indication
i64 f64 f64 bool
1 1.0 3.0 false
2 2.0 3.0 false
3 3.0 3.0 false
Code
using CSV, DataFrames

poi_data = CSV.read("../../data/poi.csv", DataFrame);
Code
first(poi_data, 3)
3×4 DataFrame
 Row │ anomaly_id  depth_mm  length_mm  indication
     │ Int64       Float64   Float64    Bool
─────┼─────────────────────────────────────────────
   1 │          1       1.0        3.0       false
   2 │          2       2.0        3.0       false
   3 │          3       3.0        3.0       false

inspection PoD: prior checks

link to a shiny app

inspection PoD: prior checks

inspection PoD: sampling

Code
library(cmdstanr)

pod_model <- cmdstan_model(stan_file = "pod.stan")
pod_model$format()
data {
  int<lower=0> n_trials; // number of observations
  vector[n_trials] depth; // damage depth
  array[n_trials] int<lower=0, upper=1> indication; // did the inspection find the damage?
}
parameters {
  real alpha; // log-odds intercept
  real beta_depth; // log-odds coefficient for depth
}
model {
  // priors
  alpha ~ normal(-6, sqrt(2));
  beta_depth ~ normal(1.5, sqrt(0.5));
  
  // likelihood
  indication ~ bernoulli_logit(alpha + beta_depth * depth);
}
generated quantities {
  vector[n_trials] log_lik;
  vector[n_trials] p;
  
  for (i in 1 : n_trials) {
    p[i] = inv_logit(alpha + beta_depth * depth[i]);
    log_lik[i] = bernoulli_logit_lpmf(indication[i] | alpha
                                                      + beta_depth * depth[i]);
  }
}
Code
import cmdstanpy

pod_model = cmdstanpy.CmdStanModel(stan_file="pod.stan")
INFO:cmdstanpy:found newer exe file, not recompiling
Code
stan_code = pod_model.code()

from pygments import highlight
from pygments.lexers import StanLexer
from pygments.formatters import NullFormatter

formatted_stan_code = highlight(stan_code, StanLexer(), NullFormatter())

print(formatted_stan_code)
data {
  int<lower=0> n_trials;                               // number of observations
  vector[n_trials] depth;                              // damage depth
  array[n_trials] int<lower=0, upper=1> indication;    // did the inspection find the damage?
}

parameters {
  real alpha;       // log-odds intercept
  real beta_depth;  // log-odds coefficient for depth
}

model {
  // priors
  alpha ~ normal(-6, sqrt(2));
  beta_depth ~ normal(1.5, sqrt(0.5));
  
  // likelihood
  indication ~ bernoulli_logit(alpha + beta_depth * depth);
}

generated quantities {
  vector[n_trials] log_lik;
  vector[n_trials] p;
  
  for (i in 1:n_trials) {
    p[i] = inv_logit(alpha + beta_depth * depth[i]);
    log_lik[i] = bernoulli_logit_lpmf(indication[i] | alpha + beta_depth * depth[i]);
  }
}
Code
using Turing
using StatsFuns: logistic

@model function pod_model(depth::Vector{Float64}, indication::Vector{Bool})
    # priors
    α ~ Normal(-6, 2)
    β_depth ~ Normal(1.5, 0.5)
    
    # likelihood
    for i in eachindex(indication)
        logit_p = α + β_depth * depth[i]
        indication[i] ~ Bernoulli(logistic(logit_p))
    end
end

inspection PoD: sampling

we can specify some additional arguments to the sample method:

Code
model_data <- list(
  n_trials = nrow(poi_data),
  depth = poi_data$depth_mm,
  indication = poi_data$indication |> as.integer()
)

pod_fit <- pod_model$sample(
  data = model_data,
  chains = 4, 
  parallel_chains = parallel::detectCores(),
  iter_warmup = 2000,
  iter_sampling = 2000,
  seed = 231123
)
Running MCMC with 4 chains, at most 16 in parallel...

Chain 1 Iteration:    1 / 4000 [  0%]  (Warmup) 
(per-chain iteration progress omitted)
Chain 4 Iteration: 4000 / 4000 [100%]  (Sampling) 
Chain 1 finished in 0.5 seconds.
Chain 2 finished in 0.5 seconds.
Chain 3 finished in 0.5 seconds.
Chain 4 finished in 0.5 seconds.

All 4 chains finished successfully.
Mean chain execution time: 0.5 seconds.
Total execution time: 0.6 seconds.
Code
import multiprocessing

model_data = {
    "n_trials": len(poi_data),
    "depth": poi_data["depth_mm"].to_list(),
    "indication": [int(x) for x in poi_data["indication"].to_list()]
}

pod_fit = pod_model.sample(
  data = model_data, 
  chains = 4, 
  parallel_chains = multiprocessing.cpu_count(),
  iter_warmup = 2000, 
  iter_sampling = 2000, 
  seed = 231123
  )
INFO:cmdstanpy:Requested 16 parallel_chains but only 4 required, will run all chains in parallel.
INFO:cmdstanpy:CmdStan start processing
chain 1 |##########| 00:00 Sampling completed
chain 2 |##########| 00:00 Sampling completed
chain 3 |##########| 00:00 Sampling completed
chain 4 |##########| 00:00 Sampling completed
INFO:cmdstanpy:CmdStan done processing.
Code
using Random

n_draws = 2_000; n_chains = 4

pod_fit = pod_model(poi_data.depth_mm, poi_data.indication) |>
  model -> sample(MersenneTwister(231123), model, NUTS(), MCMCThreads(), n_draws, n_chains)

inspection PoD: sampling

things can go wrong

inspection PoD: sampling

as well as graphical checks (such as traceplots), some numerical diagnostics have also been developed:

  • R-hat: a measure of convergence. Considers the variance within and between chains. We want this to be close to \(1.0\), which would indicate that all chains have converged to the same distribution.

  • ESS: a measure of autocorrelation. Approximate number of equivalent independent samples. We want this to be high (comparable to the number of samples).
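to make R-hat concrete, here is a simplified stdlib-Python sketch of the basic statistic on synthetic chains. Note this is a pedagogical version: production implementations (e.g. in Stan and the posterior package) additionally split chains and rank-normalise the draws.

```python
import random
import statistics

def rhat(chains):
    """Basic potential scale reduction factor (no chain-splitting or
    rank-normalisation, which modern implementations add)."""
    n = len(chains[0])   # draws per chain
    m = len(chains)      # number of chains
    chain_means = [statistics.fmean(c) for c in chains]
    grand_mean = statistics.fmean(chain_means)
    # between-chain variance
    b = n / (m - 1) * sum((cm - grand_mean) ** 2 for cm in chain_means)
    # mean within-chain variance
    w = statistics.fmean(statistics.variance(c) for c in chains)
    # pooled estimate of the marginal posterior variance
    var_plus = (n - 1) / n * w + b / n
    return (var_plus / w) ** 0.5

random.seed(42)
# four well-mixed chains sampling the same distribution
good = [[random.gauss(0, 1) for _ in range(1000)] for _ in range(4)]
# four chains stuck in different places (a failure to converge)
bad = [[random.gauss(mu, 1) for _ in range(1000)] for mu in (0.0, 1.0, 2.0, 3.0)]

print(f"R-hat, converged chains: {rhat(good):.3f}")  # close to 1
print(f"R-hat, stuck chains:     {rhat(bad):.3f}")   # well above 1
```

when chains disagree, the between-chain variance inflates the pooled variance estimate relative to the within-chain variance, pushing R-hat above 1.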

inspection PoD: sampling

Code
pod_fit$summary()
# A tibble: 203 × 10
   variable      mean  median     sd    mad      q5      q95  rhat ess_bulk
   <chr>        <dbl>   <dbl>  <dbl>  <dbl>   <dbl>    <dbl> <dbl>    <dbl>
 1 lp__       -52.8   -52.5   0.992  0.719  -54.8   -51.8     1.00    2620.
 2 alpha       -2.73   -2.72  0.575  0.580   -3.69   -1.80    1.00    1703.
 3 beta_depth   0.677   0.674 0.120  0.120    0.486   0.882   1.00    1696.
 4 log_lik[1]  -0.132  -0.122 0.0578 0.0532  -0.242  -0.0562  1.00    1804.
 5 log_lik[2]  -0.236  -0.226 0.0785 0.0760  -0.379  -0.125   1.00    2064.
 6 log_lik[3]  -0.413  -0.403 0.102  0.101   -0.592  -0.261   1.00    2854.
 7 log_lik[4]  -0.690  -0.681 0.133  0.129   -0.927  -0.489   1.00    5929.
 8 log_lik[5]  -1.08   -1.07  0.186  0.179   -1.42   -0.796   1.00    8020.
 9 log_lik[6]  -0.244  -0.237 0.0723 0.0709  -0.374  -0.137   1.00    5085.
10 log_lik[7]  -2.14   -2.12  0.376  0.378   -2.80   -1.57    1.00    3391.
# ℹ 193 more rows
# ℹ 1 more variable: ess_tail <dbl>
Code
pod_fit.summary()
             Mean     MCSE  StdDev     5%  ...     95%   N_Eff  N_Eff/s  R_hat
name                                       ...                                
lp__       -53.00  0.01900   0.980 -55.00  ... -52.000  2600.0   1800.0    1.0
alpha       -2.70  0.01400   0.560  -3.60  ...  -1.800  1600.0   1100.0    1.0
beta_depth   0.67  0.00290   0.120   0.49  ...   0.870  1600.0   1100.0    1.0
log_lik[1]  -0.13  0.00140   0.058  -0.24  ...  -0.058  1661.0   1149.0    1.0
log_lik[2]  -0.24  0.00170   0.079  -0.38  ...  -0.130  2016.0   1394.0    1.0
...           ...      ...     ...    ...  ...     ...     ...      ...    ...
p[96]        0.78  0.00076   0.056   0.68  ...   0.870  5435.0   3758.0    1.0
p[97]        0.87  0.00075   0.045   0.79  ...   0.940  3666.0   2536.0    1.0
p[98]        0.93  0.00064   0.034   0.86  ...   0.970  2753.0   1904.0    1.0
p[99]        0.96  0.00049   0.024   0.91  ...   0.990  2320.0   1605.0    1.0
p[100]       0.98  0.00035   0.016   0.95  ...   0.990  2110.0   1459.0    1.0

[203 rows x 9 columns]
Code
pod_fit
Chains MCMC chain (2000×14×4 Array{Float64, 3}):

Iterations        = 1001:1:3000
Number of chains  = 4
Samples per chain = 2000
Wall duration     = 14.49 seconds
Compute duration  = 11.37 seconds
parameters        = α, β_depth
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
  parameters      mean       std      mcse    ess_bulk    ess_tail      rhat   ⋯
      Symbol   Float64   Float64   Float64     Float64     Float64   Float64   ⋯

           α   -2.5674    0.5637    0.0116   2357.3718   2568.8054    1.0019   ⋯
     β_depth    0.6516    0.1160    0.0024   2358.7353   2623.2594    1.0016   ⋯
                                                                1 column omitted

Quantiles
  parameters      2.5%     25.0%     50.0%     75.0%     97.5%
      Symbol   Float64   Float64   Float64   Float64   Float64

           α   -3.7514   -2.9249   -2.5506   -2.1968   -1.4952
     β_depth    0.4361    0.5730    0.6466    0.7265    0.8958

scoring models

it worked, but is it good?

scoring models

  • cross-validation:
    • out of sample log likelihoods
    • K-Fold CV
    • LOO CV
  • approximations:
    • WAIC
    • PSIS-LOO

scoring models

The larger (i.e. less negative) the expected log pointwise predictive density (elpd), the better the model's expected predictive performance.
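as a sketch of what elpd measures: for each observation, take the log of the likelihood averaged over posterior draws, then sum over observations. A stdlib-Python illustration on a toy log-likelihood matrix (this is the naive in-sample estimate; LOO-CV and WAIC additionally penalise it for overfitting):

```python
import math
import random

def elpd_estimate(log_lik):
    """Naive in-sample elpd: for each observation, the log of the
    likelihood averaged over posterior draws (log-sum-exp for stability)."""
    n_draws = len(log_lik)
    total = 0.0
    for i in range(len(log_lik[0])):
        draws = [row[i] for row in log_lik]
        m = max(draws)
        total += m + math.log(sum(math.exp(x - m) for x in draws) / n_draws)
    return total

random.seed(0)
# toy S x N matrix of pointwise log-likelihoods (S draws, N observations)
S, N = 200, 100
log_lik = [[-0.5 + random.gauss(0, 0.1) for _ in range(N)] for _ in range(S)]

print(f"elpd estimate: {elpd_estimate(log_lik):.1f}")
```

in practice we use the cross-validated versions computed below, which estimate performance on new data rather than the data the model was fit to.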

Code
pod_fit$loo(variables = "log_lik")

Computed from 8000 by 100 log-likelihood matrix

         Estimate   SE
elpd_loo    -50.5  6.6
p_loo         2.3  0.5
looic       101.0 13.3
------
Monte Carlo SE of elpd_loo is 0.0.

All Pareto k estimates are good (k < 0.5).
See help('pareto-k-diagnostic') for details.
Code

# ...
Code
using ParetoSmooth

pod_model(poi_data.depth_mm, poi_data.indication) |>
  model -> psis_loo(model, pod_fit)

scoring models

can we do any better if we include damage length?

Code
pod_2d_model <- cmdstan_model(stan_file = "pod_2d.stan")

pod_2d_model$format()
data {
  int<lower=0> n_trials; // number of observations
  vector[n_trials] depth; // damage depth
  vector[n_trials] length; // damage length
  array[n_trials] int<lower=0, upper=1> indication; // did the inspection find the damage?
}
parameters {
  real alpha; // log-odds intercept
  real beta_depth; // log-odds coefficient for depth
  real beta_length; // log-odds coefficient for length
}
model {
  // priors
  alpha ~ normal(-6, 2);
  beta_depth ~ normal(1.5, 0.5);
  beta_length ~ normal(1.5, 0.5);
  
  // likelihood
  indication ~ bernoulli_logit(alpha + beta_depth * depth
                               + beta_length * length);
}
generated quantities {
  vector[n_trials] log_lik;
  vector[n_trials] p;
  
  for (i in 1 : n_trials) {
    p[i] = inv_logit(alpha + beta_depth * depth[i] + beta_length * length[i]);
    log_lik[i] = bernoulli_logit_lpmf(indication[i] | alpha
                                                      + beta_depth * depth[i]
                                                      + beta_length
                                                        * length[i]);
  }
}
Running MCMC with 4 chains, at most 16 in parallel...

Chain 1 Iteration:    1 / 4000 [  0%]  (Warmup) 
(per-chain iteration progress omitted)
Chain 4 Iteration: 4000 / 4000 [100%]  (Sampling) 
Chain 1 finished in 0.8 seconds.
Chain 2 finished in 0.8 seconds.
Chain 3 finished in 0.8 seconds.
Chain 4 finished in 0.8 seconds.

All 4 chains finished successfully.
Mean chain execution time: 0.8 seconds.
Total execution time: 0.9 seconds.
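to decide whether the depth + length model is actually better, the usual approach is to compare the two models' pointwise elpd values: sum the differences and compute the standard error of that sum. A stdlib-Python sketch with hypothetical pointwise elpd vectors (the numbers here are illustrative, not the fitted results; in R this is what `loo::loo_compare()` reports):

```python
import math
import random
import statistics

def elpd_diff(elpd_a, elpd_b):
    """Sum of pointwise elpd differences and its standard error."""
    diffs = [a - b for a, b in zip(elpd_a, elpd_b)]
    n = len(diffs)
    total = sum(diffs)
    # SE of the sum of n pointwise differences
    se = math.sqrt(n) * statistics.stdev(diffs)
    return total, se

random.seed(7)
# hypothetical pointwise elpd values for the depth-only and
# depth + length models, over the same 100 observations
elpd_1d = [-0.5 + random.gauss(0, 0.3) for _ in range(100)]
elpd_2d = [x + 0.05 + random.gauss(0, 0.1) for x in elpd_1d]

diff, se = elpd_diff(elpd_2d, elpd_1d)
print(f"elpd difference (2d - 1d): {diff:.1f} +/- {se:.1f}")
# a difference within roughly 2 SE of zero is weak evidence
# for preferring the larger model
```

because the differences are computed pointwise on shared observations, the SE accounts for the correlation between the two models' predictions.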

break?